SELF-RATINGS OF COLLEGE TEACHERS: A COMPARISON WITH STUDENT RATINGS

John A. Centra 
Educational Testing Service 

Abstract 

College teachers' self-ratings were investigated in this study by
comparing them to ratings given by students. The sample consisted of 343
teaching faculty from five colleges; these teachers, as well as the students
in one of their classes, responded to a 21-item instructional report
questionnaire.

Correlating teacher responses to each item with the mean class
responses (across the 343 classes) disclosed a modest relationship between
the two sets of evaluations: a median correlation of .21 for the items.

In addition to the general lack of agreement between self- and student
evaluations, there was also a tendency for teachers as a group to give
themselves better ratings than their students did. Comparisons between
student and faculty responses were also made across items, and a rank
correlation of .77 indicated a good deal of similarity in the way the two
groups rank ordered the items.

Discrepancies between individual teacher ratings and ratings given by
the class were further analyzed for: (a) sex of the teacher (no difference
found); (b) number of years of teaching experience (no difference); and (c)
subject area of the course (differences noted for natural science courses
vs. those in education and applied areas).

Among other conclusions, the results of this study would argue for
the collection of student ratings to supplement self-ratings.

Teacher self-ratings have been proposed as a possible source of
information for performance improvement and, to a lesser extent, as an
input into performance evaluation. As a basis for decisions on promotion
or salary, self-evaluations are not likely to have much validity. But
it is possible that some form of systematic self-evaluation could be
helpful to the teacher trying to improve instruction, particularly if
combined with external evaluations provided by students or colleagues.

There has been little research on teacher self-ratings. In particular,
the relationship between self-ratings and those provided by students or
colleagues is not yet fully known. With 51 instructors in a military
setting, Webb and Nolan (1955) reported a correlation of .62 between
instructor self-ratings and student ratings. Clark and Blackburn (1971),
however, reported a correlation of .19 between student ratings and faculty
self-ratings at a small college, and a similarly moderate correlation (.28)
between self-ratings and colleague ratings. In both of these studies,
overall teaching was rated rather than specific instructional practices.

The purpose of this study was to further investigate college teachers'
self-ratings and ratings given by students by comparing these two sets of
ratings over a wide range of specific, student-oriented instructional
practices. Discrepancies between self-ratings (or self-descriptions) and
those provided by students would underscore the need for student feedback
to the instructor as well as highlight specific areas of instruction where
feedback is most essential. Differences in ratings were also studied
to investigate their relationships to selected teacher and course
characteristics.



Procedure 

The sample for the study consisted of 343 teaching faculty at five
institutions of higher education. Between 75 and 90 per cent of the teachers
invited from each college participated in the study. The five institutions
included two state colleges (one of which had a predominantly black
enrollment), a selective liberal arts college, a multipurpose college, and an
urban community college. None of these institutions had, at the time of
the study, a systematic program to collect student ratings, nor did a 
significant portion of their faculty collect student ratings on their own. 
The majority of teachers in this study, therefore, were not familiar with 
how students might rate their instruction. 

Students and teachers responded to 21 items dealing with instructional 
practices. The student questionnaire was titled the "Midsemester Student 
Instructional Report" and actually contained 23 items, 21 of which were 
judged appropriate for instructor self-ratings. Included were items that 
faculty members in an earlier study had identified as providing information 
they would like to receive from students (Centra, 1972). Among the 
dimensions of instruction included were the organization of the course, 
student-teacher interaction, instructor communication, student effort, and 
stimulation of students. Previous factor analytic studies had identified 
several of these as dimensions that effectively differentiated among 
instructors (Coffman, 1954; Gibb, 1955; Hodgson, 1958; Isaacson, McKeachie,
Milholland, Lin, Hofeller, Baerwaldt, & Zinn, 1964).

Responses to 17 of the items were on a four-point agree-disagree scale,
with a "not applicable" option also provided. The four remaining items
used a four- or five-point scale with different response options for each
item. The wording for each of the statements in the questionnaire
differed slightly for students and instructors. For example, an item on
course objectives was worded as follows for each group:

For students: The instructor's objectives have been made clear

For teachers: I feel my objectives for the course have been made
clear to students

Teachers were asked to "describe this course, your teaching, or the 
students enrolled." They were told that the reason for obtaining this 
self-report was to see which items were tapping information already known
to most instructors. 

The data were collected at midsemester of the Fall 1971 term.
Instructors administered the rating form in one class of their own choosing,
with the understanding that only they would receive a summary of their
students' responses.

Analyses 

Faculty-student comparisons were made in a number of ways. First, 
the relationship between the two sets of ratings was studied by correlating 
instructor responses to each of the 21 items with the mean responses of 
students in their class (N = 343 classes). Secondly, differences between
the way faculty as a group and students as a group rated or described
instruction were investigated by a comparison of means; i.e., the mean
score for all teachers on each item was compared to the average of the
student class means.

Finally, the discrepancy between each instructor's response and the
mean response of his class was of particular interest. The extent of that
discrepancy and its relationship with specific teacher or course variables
(i.e., sex, years of teaching experience, subject area of the course)
were analyzed through multivariate analysis of variance.
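
As a rough illustration of the first two comparisons described above, the
following Python sketch correlates instructor self-ratings with class means
item by item, takes the median of those correlations, and sets the faculty
mean beside the mean of the class means. The function names and the toy
data are hypothetical and are not taken from the study, whose own
computations are not described at this level of detail.

```python
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_summaries(self_ratings, class_means):
    """Summarize each item across classes.

    self_ratings[i][k]: instructor k's own response to item i
    class_means[i][k]:  mean student response to item i in instructor k's class
    """
    rows = []
    for item_self, item_class in zip(self_ratings, class_means):
        rows.append({
            "r": pearson(item_self, item_class),   # self vs. class agreement
            "faculty_mean": mean(item_self),       # mean over all teachers
            "student_mean": mean(item_class),      # mean of the class means
        })
    return rows


# Hypothetical toy data: 2 items, 5 classes
# (1 = strongly agree ... 4 = strongly disagree, so lower is more favorable).
self_ratings = [[1, 2, 1, 2, 1], [2, 2, 1, 3, 2]]
class_means = [[1.5, 2.0, 1.8, 2.4, 1.6], [2.2, 2.6, 1.9, 2.8, 2.5]]

summaries = item_summaries(self_ratings, class_means)
median_r = median(row["r"] for row in summaries)
```

With the study's data, `self_ratings` and `class_means` would each hold 21
items by 343 classes, and `median_r` would correspond to the median item
correlation reported in the results.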

Results and Discussion 

The results of the comparison of means and the correlational analysis
for items 5-21 are presented in Table 1. The correlation between the two
sets of descriptions or ratings was not particularly high, indicating only
modest agreement in the way faculty and students perceived instruction.
While the correlation between faculty and student responses was significantly
different from zero for most of the items due to the large N (343), the
median correlation was only .21.

Insert Table 1 about here 

Also listed in Table 1 are the mean faculty responses for each item
and a ranking of the items, the mean of the classroom (student) means and
a ranking of those scores, the results of the t-tests, and the number of
colleges where the difference between the means was significant. A
graphical presentation of the data is given in Figure 1. Responses
for items 5-21 could range from one for "strongly agree" to four for
"strongly disagree"; thus, lower values represent greater agreement with
each statement. The comparisons of the mean values indicate that instructors
as a group generally rated or described their teaching more favorably than
did their students. (Students' ratings were also skewed toward the more
favorable end of the scale, which is usually the case with this type of
instrument.) In particular, instructors and students did not agree on the
following items: the extent to which students are free to ask questions
or give opinions in class (item 14), the extent to which instructors are
concerned with student learning (11), the amount of agreement between
objectives and what is being taught (6), instructor openness to other
viewpoints (20), the extent to which instructors inform students of how
they would be evaluated (16), whether the instructor encourages students
to think for themselves (10), and the clarity of course objectives (5).
For each of these seven items, instructor-student differences were
notable at either four or all five of the colleges.

Insert Figure 1 about here

On the other hand, there was little difference between the faculty
and student groups in their ratings of the instructor's preparation for
class (15) and on the extent to which course objectives were being
accomplished (21). For the remaining eight items, the differences were modest
and in many instances not significant within a college.

Another way to look at the data is to compare items with each other.
The question then becomes: To what extent do the groups of teachers and
students order the items similarly? A ranking of item means for each of the
two groups indicates fairly high similarity; in fact, a rank correlation
of .77. This would suggest that, while teachers and students are generally
using different points on the scale in responding to the items (as
indicated by the comparison of means), both groups tend to see the same
relative strengths and weaknesses among the teachers in this study. For
example, while there is a large mean difference between the groups on
instructor concern with student learning (item 11), both groups rated
instructors favorably on this item in comparison to other aspects of
teaching. Keeping in mind that higher scores represent unfavorable
(disagree) responses, both groups also rated the instructors in this study
poorly on stimulating student interest in the course (18).
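
The across-item comparison can be illustrated with a short Python sketch
that ranks the item means within each group and correlates the two sets of
ranks. The paper says only "rank correlation"; the Spearman coefficient
used here is an assumption, and the function names and item means are
hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from lowest (rank 1) upward; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical item means for five items (lower = more favorable).
faculty_item_means = [1.4, 1.9, 1.6, 2.3, 1.7]
student_item_means = [1.8, 2.2, 2.1, 2.6, 1.9]

rho = spearman(faculty_item_means, student_item_means)
```

In this toy case the student means are uniformly higher than the faculty
means, yet the two orderings nearly coincide, which is the pattern the
paragraph above describes: different points on the scale, similar relative
standing of the items.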

Generally speaking, combining the ranks of both teachers and students
indicates that not stimulating student interest enough (18), the lack
of helpful comments on papers or exams (12), and not knowing when students
understand the material tended to be rated as the most frequent criticisms
of instruction for the teachers in this study. On the other hand, their
strengths were in allowing students to feel free to ask questions or give
opinions (14) and in their concern with student learning (11).

Individual Teacher-Class Differences 

Probably more important than a comparison of the way an average 
instructor and an average class rated instruction is some knowledge of how 
many instructors perceived themselves far differently than their students 
did. A distribution of the differences between each instructor's 
responses and those of his class (i.e., the class means) provides that 
information. Presented in Table 2 is a summary of the results of such a 
distribution. For each item, the percentage of instructors who gave
themselves "considerably poorer" or "considerably better" ratings is indicated
within each college and for the total sample. A difference of .63 or
greater was used to define "considerably poorer or better" because a
difference of at least that great would appear to be large enough to
have some practical significance; it is also the approximate standard
deviation for most of the student item responses.
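
The classification just described can be sketched in Python as follows.
The .63 cutoff comes from the text; the direction of the difference
(lower scores being more favorable on the agree-disagree items), the
"similar" label, the function names, and the toy data are assumptions made
for illustration.

```python
from collections import Counter

THRESHOLD = 0.63  # cutoff stated in the text


def classify(self_rating, class_mean, threshold=THRESHOLD):
    """Label one instructor's self-rating relative to the class mean.

    Assumes lower scores are more favorable, so a self-rating well below
    the class mean counts as "considerably better".
    """
    diff = class_mean - self_rating
    if diff >= threshold:
        return "considerably better"
    if diff <= -threshold:
        return "considerably poorer"
    return "similar"  # hypothetical label for the remaining instructors


def percentages(self_ratings, class_means):
    """Percentage of instructors in each category for a single item."""
    labels = [classify(s, c) for s, c in zip(self_ratings, class_means)]
    counts = Counter(labels)
    return {label: 100 * n / len(labels) for label, n in counts.items()}


# Hypothetical toy data for one item across five classes.
self_ratings = [1, 2, 1, 3, 2]
class_means = [1.9, 2.1, 1.2, 2.2, 2.5]

result = percentages(self_ratings, class_means)
```

Applied item by item and college by college, a tabulation of this kind
would yield percentages of the sort summarized in Table 2.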

Insert Table 2 about here



For most of the items, between a fourth and a third of the instructors
described or rated themselves considerably better than their students did.
The median, in fact, was just under 30 per cent for all 343 instructors
and their classes. Forty-one per cent of the instructors gave themselves
better ratings on item 14: students are free to ask questions or give
opinions in class; and 36 per cent on item 11: the instructor is concerned
about whether students learn and tries to be actively helpful. Both items
deal with faculty-student interaction, as do items 8, 9, 10, and 16, for
which fairly high percentages of instructors also gave themselves better
ratings. The faculty-student interaction dimension, then, appears to be
one on which a sizable number of instructors and their students do not
agree and on which student reactions would appear to be especially crucial.
Other similar areas would be the instructor's openness to other viewpoints
(item 20) and the agreement between announced objectives for the course
and what was being taught (6).

A surprisingly large percentage of instructors rated themselves poorer
than students did in a few areas. Fifteen per cent rated themselves more
poorly on class preparation and 12 per cent were less satisfied that they
were accomplishing course objectives. In general, however, only between
4 and 8 per cent of the teachers gave themselves considerably poorer ratings.

One of the items in the form was unique in that it elicited opinions
on student effort in the course (19). For students, the exact wording was:
"I have been putting a good deal of effort into this course"; for instructors
it was worded: "Students seem to be putting a good deal of effort into this
course." The results for this item, as one might expect, were much different
than those for other items. Compared to students' responses, 18 per cent
of the faculty thought students generally were putting considerably less
effort into the course, while 10 per cent gave students better ratings on
effort than students gave themselves. In other words, in this instance
students have tended to give themselves better ratings just as instructors
did on so many of the previous items.

An inspection of the differences within each college indicates fairly
similar results with the exception of college five. In comparison to the
other four colleges, higher percentages of the instructors at college
five rated themselves considerably better than did their students on a
majority of the items. While it is not possible to conclude much on the
basis of one college, it is interesting to note that college five was the
smallest and most selective of the colleges in the study. Moreover,
instructors at college five were given the poorest student ratings among the
five colleges, whereas their self-ratings were not much different or poorer
than those of instructors elsewhere. Thus, the gap between instructor and
student ratings at college five was due largely to the poorer ratings by
students, perhaps because of higher expectations on their part, rather
than to better ratings by instructors.

Presented in Table 3 is a summary of responses to the first four items,
which used varied responses rather than agree-disagree options. The items
deal with the pace, the level of difficulty, and the work load of the
course, as well as the extent to which the instructor used examples and
illustrations. Once again there were student-instructor differences, although
they were not particularly large. Instructors tended to think they more
often used examples and illustrations, and at three of the colleges
instructors were more likely to consider the pace at which material was
covered to be slow. College five, the selective liberal arts college, was once
again noteworthy in that its faculty and to some extent the students
reported less frequent use of examples or illustrations in courses.



Insert Table 3 about here 



A final question regarding individual teacher-class differences was
whether those differences were related to the sex of the instructor, the
amount of teaching experience, or the subject area taught. Are the
self-ratings of female teachers, for example, more similar to their
students' ratings than are those of male teachers? For this analysis, each
course was grouped into one of four general subject area categories: natural
sciences, humanities, social sciences, and education and applied subjects
(e.g., business, home economics, nursing). Teaching experience consisted of
three categories: one or two years, three to six years, and seven years or
more. Data for 235 teachers were available for this analysis.

The results of the multivariate analysis of variance, in which all
21 items were used as variables, are given in Table 4. There were no
differences due to sex or years of teaching experience or for any of the
interactions; there was, however, a significant difference (p < .05)
due to subject area. This difference was largely between natural science
courses and those in education and applied subjects. Specifically, teachers
in the natural sciences did not think the pace of the course was as fast
as their students said it was, and they did not think students put as much
effort into the course as students said they did. Conversely, teachers
in education and applied subjects reported the course as having a faster
pace than their students reported, and thought that students put more
effort into the course than students said they did.

Insert Table 4 about here



Summary and Conclusions 

A comparison of students' ratings of instruction with teachers'
self-reported ratings in over 300 classes at five colleges disclosed a modest
relationship between the two sets of evaluations. The median correlation
for 17 items was .21, indicating that faculty members generally evaluate
or describe their teaching somewhat differently from the way it is
evaluated or described by their students. Not surprisingly, the highest
correlations occurred for the more factual items, on which there was
somewhat less chance for disagreement (e.g., the instructor informs students
of how they would be evaluated), while items eliciting opinions (e.g.,
the instructor is using class time well) resulted in the lowest correlations.

As mentioned earlier, previous studies in which student and faculty
ratings of instruction had been compared employed a single overall measure
of teaching and produced conflicting results: .62 in one instance (Webb &
Nolan, 1955) and .19 in the other (Clark & Blackburn, 1971). The latter
correlation was reported for college teachers and, of course, was fairly similar
to the median correlation for the 17 items used in the five-college study
reported here. Webb and Nolan's use of instructors in a military setting
may explain the unusually high correlation found in their study; in any
event, it does not seem to apply to more typical college teaching situations.

In addition to the general lack of agreement between self- and student
evaluations, there was also a tendency for teachers as a group to give
themselves better ratings than their students did. In a sense this
tendency might be viewed as only "human," or certainly not surprising.
As Robert Burns has reminded us, most people do not see themselves as others
see them; teachers and the way they see their instruction are apparently
no exception.

Comparisons between student and faculty responses were also made
across items, and a rank correlation of .77 indicated a good deal of
similarity in the way the two groups rank ordered the items. This suggests
that instructors are indeed aware of many of their particular teaching
strengths and weaknesses, even though they see themselves more favorably
in absolute terms. They are also probably more aware of their own relative
strengths and weaknesses than they are of the way they might compare to
other instructors, as suggested by the previously cited correlational data
for each item. An ipsative approach to student rating of faculty,
therefore, in which the emphasis is on identifying the specific "good" and
"bad" practices of each individual teacher, would not appear to be as
informative to instructors as the normative approach, in which comparisons
may be made with other relevant groups of instructors.

The discrepancy between individual teacher ratings and the mean rating
given by his class was most notable for between a fourth and a third of the
343 instructors in the study, and in particular for items related to
student-instructor interaction, course objectives, and the instructor's
openness to other viewpoints. These areas of instruction, then, would seem to
be particular ones in which a sizable proportion of teachers could profit from
student feedback.

Teacher-student discrepancies were about the same for men and women
teachers and for the more and less experienced teachers. That there were
no sex differences in rating discrepancies is not particularly surprising;
but one might have predicted that the self-ratings of more experienced
teachers would be closer to student ratings. Since most of the teachers
in this study had not made a practice of obtaining systematic feedback
from their students, the findings suggest that getting to know student
reactions to teaching is not something that comes merely with experience.

Of particular interest, however, were the differential discrepancies
noted for the subject areas; teachers of natural science subjects
underestimated (relative to their students) both the pace of their course and
their students' efforts, while teachers of education and applied subjects
overestimated the course pace and their students' efforts. These subject
area differences might be explained by the differences in the content and
in the intended objectives of courses in each area. Instructors of
mathematics, physics, biology, and the like may feel that there is so much
factual and theoretical material to cover in their courses that a fast
pace coupled with a good deal of student effort is a necessity. What
teachers in the natural sciences view as an acceptable pace and work load,
however, apparently does not coincide with the view of their students, who
frequently are using courses in other fields for comparison. In education and
applied subject areas, not only might the amount of factual material be less
demanding on students, but frequently the major objectives of the courses
are to establish particular attitudes or skills with students. Working
toward those objectives may result in courses that appear slower paced to
students.

In conclusion, the results of this study would argue for the collection
of student ratings as a means of providing instructors with information
they do not already have about their teaching. As an aid to instructional
improvement, teacher self-ratings might in fact be used in conjunction with
student feedback as a means of highlighting discrepancies for the individual
instructor.

Table 3

Faculty-Student Comparisons at Five Colleges and Total (N = 343),
for Four Items in Instructional Report Questionnaire

                                        Percentage Responding(a)

                                     Students                    Faculty
                                     College                     College
                                 1   2   3   4   5  Total    1   2   3   4   5  Total

1 Pace at which material
  is covered:
    Very or somewhat slow        9  --  --  --  --   --     --  --  --  --  --   --
    Very or somewhat fast       36  20  21  --  --   --     --  --  --  --  --   --

2 Level of difficulty of
  course for students
  enrolled:
    Very or somewhat elementary 11  13  10  --  --   --     --  --  --  --  --   --
    Very or somewhat difficult  31  25  32  --  --   --     --  --  --  --  --   --

3 Work load of course
  relative to others:
    Lighter                     --  --  --  --  --   --     --  --  --  --  --   --
    Heavier                     --  --  --  --  --   --     --  --  --  --  --   --

4 Extent to which examples
  and illustrations were
  used:
    Frequently                  60  70  76  67  58   67     88  75  86  82  65   80
    Occasionally                28  26  20  26  34   26     12  21  14  18  32   19
    Seldom                      10  --  --   6   8    6      0   2   0   0   3    1
    Never                        2   1   1   1   1    1      0   2   0   0   0    1

(a) For items 1-3, the four responses have been collapsed into two
categories; the middle response ("about right" or "about the same") is
not shown.

Table 4

Summary of MANOVA Results of Instructor-Class Differences
by Sex, Subject Area, and Number of Years Teaching
(N = 235)

Source(a)                         df Hypothesis   df Error      F      p <

Sex                                     21           192        .34    .99
Years of Teaching                       42           384       1.09    .34
Subject Area                            63           574       1.33    .05
Sex x Years Teaching                    42           384        .86    .72
Sex x Subject Area                      63           574        .62    .99
Years Teaching x Subject Area          126          1121        .65    .89

(a) The triple-order interaction was not run because one of the cells was
blank.



Fig. 1. Faculty and student mean responses to items in instructional
report.